perf: Replace git subprocesses with libgit2 and skip unnecessary summary work#11938
Merged
anthonyshew merged 2 commits into main on Feb 20, 2026
Force-pushed from 5635ecb to de4e268
libgit2 and skip unnecessary summary work
Eliminate fork+exec overhead for the three hottest git subprocess calls in turbo run startup: WorktreeInfo::detect, git ls-tree, and git status. Replace with in-process libgit2 equivalents (Repository::discover, tree.walk, repo.statuses). Also skip expensive TaskSummary construction and SCMState::get when neither --dry nor --summarize is set, and use sorted Vec instead of BTreeMap for ls-tree results for better cache locality.
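The "sorted Vec instead of BTreeMap" idea from the commit message can be sketched as follows. This is a minimal illustration only, not the code from this PR: `TreeEntry`, `build_index`, and `lookup` are hypothetical names, and the entries stand in for whatever the ls-tree walk actually yields.

```rust
/// Hypothetical stand-in for one ls-tree result: (repo-relative path, object hash).
type TreeEntry = (String, String);

/// Sort once at construction time so that later lookups can binary-search
/// a single contiguous allocation instead of chasing BTreeMap nodes.
fn build_index(mut entries: Vec<TreeEntry>) -> Vec<TreeEntry> {
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));
    entries
}

/// Exact-path lookup in the sorted Vec; O(log n) comparisons.
fn lookup<'a>(index: &'a [TreeEntry], path: &str) -> Option<&'a str> {
    index
        .binary_search_by(|(p, _)| p.as_str().cmp(path))
        .ok()
        .map(|i| index[i].1.as_str())
}
```

The trade-off is that a sorted `Vec` is cheap to read but expensive to mutate, so it fits data that is built once and then only queried.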
Force-pushed from de4e268 to 2b40c2e
…py octal escape warnings
github-actions bot added a commit that referenced this pull request on Feb 20, 2026
## Release v2.8.11-canary.15

Versioned docs: https://v2-8-11-canary-15.turborepo.dev

### Changes

- release(turborepo): 2.8.11-canary.14 (#11939) (`59866f5`)
- perf: Replace git subprocesses with `libgit2` and skip unnecessary summary work (#11938) (`58e3c00`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>
anthonyshew added a commit that referenced this pull request on Feb 20, 2026
…ses (#11942)

## Summary

Follow-up to #11938. Targets the per-package hashing hot path that dominates at scale, plus eliminates the last two git subprocesses from `--dry` runs.

### Small repo (~6 packages)

| | Mean | Range |
|---|---|---|
| **This PR** | 571.2ms ± 46.7ms | 515.6ms - 651.7ms |
| **main** | 587.4ms ± 45.1ms | 524.9ms - 676.3ms |
| | **1.03 ± 0.12x faster** | |

### Medium repo (~120 packages)

| | Mean | Range |
|---|---|---|
| **This PR** | 1.096s ± 0.095s | 1.015s - 1.280s |
| **main** | 1.119s ± 0.072s | 1.042s - 1.259s |
| | **1.02 ± 0.11x faster** | |

### Large repo (~1000 packages)

| | Mean | Range |
|---|---|---|
| **This PR** | 1.729s ± 0.151s | 1.548s - 1.969s |
| **main** | 1.833s ± 0.181s | 1.583s - 2.099s |
| | **1.06 ± 0.14x faster** | |

The small repo results best isolate the fixed-cost improvements (git2 for branch/SHA, reduced allocation overhead) since per-package work is minimal. At larger scales, the improvements are present but within noise because wall-clock time is already well-parallelized across rayon threads.

## Benchmarks

All benchmarks: `turbo run <task> --skip-infer --dry`, 5 warmup + 10 measured runs, release build.

## Changes

- **FileHashes: HashMap to sorted Vec**: `FileHashes` inner type changed from `HashMap` to pre-sorted `Vec`. Eliminates HashMap construction (hashing, bucket allocation, rehashing) in the per-package hashing pipeline and removes redundant re-sorting in Cap'n Proto serialization. The sort happens once at the construction boundary; downstream consumers (`expanded_inputs`, `.hash()`) get pre-sorted data for free.
- **Status entry binary search**: `get_package_hashes` now uses `partition_point` on pre-sorted status entries instead of a linear scan. Reduces per-package status lookup from O(dirty_files) to O(log(dirty_files) + matched). Also adds `with_capacity` to the per-package HashMap to avoid rehashing.
- **git2 for branch/SHA**: `get_current_branch` and `get_current_sha` (called by `SCMState::get` in `to_summary`) now use `git2::Repository` instead of forking `git branch --show-current` and `git rev-parse HEAD`. Gated behind `#[cfg(feature = "git2")]` with subprocess fallback.
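To illustrate the O(log(dirty_files) + matched) lookup described for status entries, here is a rough sketch of a prefix-range query built on `slice::partition_point`. It is not the actual `get_package_hashes` code; `StatusEntry`, its fields, and `entries_under` are hypothetical, and the sketch assumes the entries are already sorted by path in byte order and that the package prefix ends with `/`.

```rust
/// Hypothetical status record: a repo-relative path plus a deletion flag.
#[derive(Debug, Clone, PartialEq)]
struct StatusEntry {
    path: String,
    is_delete: bool,
}

/// Return the contiguous run of entries whose path starts with `pkg_prefix`.
///
/// Assumes `sorted` is ordered by `path`. Pass a prefix with a trailing
/// slash (for example "packages/ui/") so that "packages/ui" does not also
/// match "packages/ui-kit/...".
fn entries_under<'a>(sorted: &'a [StatusEntry], pkg_prefix: &str) -> &'a [StatusEntry] {
    // First entry that is not lexicographically below the prefix: O(log n).
    let start = sorted.partition_point(|e| e.path.as_str() < pkg_prefix);
    // In sorted order, all entries sharing the prefix are adjacent, so
    // `starts_with` is true for a leading run of this tail and false after it.
    let len = sorted[start..].partition_point(|e| e.path.starts_with(pkg_prefix));
    &sorted[start..start + len]
}
```

Both calls are binary searches, so locating the range costs O(log n); the "+ matched" term comes from whatever the caller then does with each entry in the returned slice.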
Summary
- Replace three git subprocess calls (`WorktreeInfo::detect`, `git ls-tree`, `git status`) with in-process libgit2 equivalents, eliminating fork+exec overhead
- Skip `TaskSummary` construction and `SCMState::get` (2 git subprocesses) when neither `--dry` nor `--summarize` is set
- Use a sorted `Vec` with `partition_point` instead of `BTreeMap` for ls-tree results (better cache locality)
- … `BTreeMap` directly from `HashTrackerInfo::expanded_inputs` trait, eliminating an intermediate `HashMap` clone

Small repo (~6 packages):
Medium repo (~120 packages):
Large repo (~1000 packages):
This is a small fixed cost for every repo, so the improvement becomes harder to observe as the repo gets larger.
Measured with `hyperfine` comparing `main` (Benchmark 2) vs this branch (Benchmark 1), `--warmup 5`, 10 runs each. All runs use `--skip-infer --dry`.

Note: the `--dry` flag means the lazy summary optimization doesn't apply in these benchmarks. Real `turbo run` invocations (without `--dry` or `--summarize`) will see additional savings from skipping `TaskSummary` construction and `SCMState::get`.
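The lazy summary optimization mentioned in the note amounts to gating expensive work behind the two flags. A minimal sketch of that pattern, assuming hypothetical `RunOpts` and `TaskSummary` shapes that are not taken from the turborepo source:

```rust
/// Hypothetical subset of run options; only the two flags that matter here.
struct RunOpts {
    dry: bool,
    summarize: bool,
}

/// Hypothetical placeholder for the expensive-to-build summary.
struct TaskSummary {
    task_count: usize,
}

/// Invoke the (possibly expensive) `build` closure only when a summary
/// will actually be consumed, i.e. when `--dry` or `--summarize` is set.
fn maybe_summary(
    opts: &RunOpts,
    build: impl FnOnce() -> TaskSummary,
) -> Option<TaskSummary> {
    if opts.dry || opts.summarize {
        Some(build())
    } else {
        None
    }
}
```

Passing the construction as a closure keeps the cost (including any SCM queries the builder would make) off the common path where neither flag is set.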